Hi All,
I have built a Word document comparison tool (a Windows application) using ASP.NET. I use the Microsoft Word object library to read the documents, and it works fine. When I started the project, the client wanted to compare the document text only; formatting, headers, and footers did not need to be compared, so I did that easily using the document.text property. Now the client also wants headers and footers compared, and I am unable to read them in a sequential manner. Could anybody help me with this?
Note:
I want to read the document in the following manner:
- Read the header, if the document contains one
- Read the paragraphs that contain text
- Read the footer
Each page has to be read in the manner described above.
I can easily read the header, the footer, and the document's body text separately. Because I need to compare the source document and the destination document line by line and show the mismatched records in a report, I need to read the Word document line by line. Please help me.
Thanks and Regards
Ramadurai Jayaraman
Anonymous User
06-Jul-2019
Thanks for the post.
Ely Sanders
19-Aug-2013
Hi, I'm not quite sure what your requirements are, but here is one simple solution you can try.
- Read each DOCX file in C# and store its text content as a string (this will also include the header/footer text content).
- Split each document's text content into a string array containing the document lines.
- Compare the two arrays line by line.
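// Needs "using System;" and "using System.IO;" plus the namespace of the Word component that provides DocumentModel.
// 'desktop' was not defined in the original snippet; it is assumed here to be the current user's Desktop folder.
string desktop = Environment.GetFolderPath(Environment.SpecialFolder.Desktop);
string sourceDocumentText = DocumentModel.Load(Path.Combine(desktop, "Source.docx")).Content.ToString();
string destinationDocumentText = DocumentModel.Load(Path.Combine(desktop, "Destination.docx")).Content.ToString();

// Note: splitting on '\t' as well means tab-separated text ends up as separate entries.
char[] separators = { '\n', '\r', '\t' };
string[] sourceTextLines = sourceDocumentText.Split(separators, StringSplitOptions.RemoveEmptyEntries);
string[] destinationTextLines = destinationDocumentText.Split(separators, StringSplitOptions.RemoveEmptyEntries);

// Loop over the longer array so that a shorter document neither throws nor hides extra lines.
int lineCount = Math.Max(sourceTextLines.Length, destinationTextLines.Length);
for (int i = 0; i < lineCount; i++)
{
    // A document that has run out of lines yields null, which counts as a mismatch.
    string sourceLine = i < sourceTextLines.Length ? sourceTextLines[i] : null;
    string destinationLine = i < destinationTextLines.Length ? destinationTextLines[i] : null;
    if (!string.Equals(sourceLine, destinationLine))
    {
        // Expose mismatched record in report format.
    }
}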
I used this C# Word component.
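
In case it helps with the header, then body, then footer ordering asked about in the question, here is a minimal sketch of one way to keep that order during the comparison. It assumes the header, body, and footer text have already been extracted by whatever Word library is in use, and it works per section rather than per page, since Word attaches headers and footers to sections. SectionText and SequentialComparer are hypothetical names made up for this illustration; they are not part of any Word library.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical container (not a library type): the text of one document
// section, already extracted by whatever Word library is in use.
public sealed class SectionText
{
    public string Header = "";
    public string Body = "";
    public string Footer = "";
}

public static class SequentialComparer
{
    private static readonly char[] LineBreaks = { '\r', '\n' };

    // Flattens the sections into labeled lines in reading order:
    // header first, then body, then footer, section by section.
    public static List<string> Flatten(IEnumerable<SectionText> sections)
    {
        var lines = new List<string>();
        int sectionNumber = 0;
        foreach (SectionText section in sections)
        {
            sectionNumber++;
            AddLines(lines, "S" + sectionNumber + " header", section.Header);
            AddLines(lines, "S" + sectionNumber + " body", section.Body);
            AddLines(lines, "S" + sectionNumber + " footer", section.Footer);
        }
        return lines;
    }

    // Splits one part into non-empty lines and prefixes each with its label.
    private static void AddLines(List<string> target, string label, string text)
    {
        string[] parts = (text ?? "").Split(LineBreaks, StringSplitOptions.RemoveEmptyEntries);
        foreach (string line in parts)
        {
            target.Add(label + ": " + line);
        }
    }

    // Returns one report entry per position where the two line lists differ.
    public static List<string> Compare(IList<string> source, IList<string> destination)
    {
        var report = new List<string>();
        int count = Math.Max(source.Count, destination.Count);
        for (int i = 0; i < count; i++)
        {
            string left = i < source.Count ? source[i] : "<missing>";
            string right = i < destination.Count ? destination[i] : "<missing>";
            if (!string.Equals(left, right, StringComparison.Ordinal))
            {
                report.Add("Line " + (i + 1) + ": \"" + left + "\" != \"" + right + "\"");
            }
        }
        return report;
    }
}
```

You would call Flatten once for the source document and once for the destination, then pass both lists to Compare. Keep in mind that this is a purely positional comparison: one extra line early in a document shifts every later line, so a proper diff algorithm would be a better fit if insertions and deletions are common.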